Reflectance Hashing for Material Recognition
We introduce a novel method for using reflectance to identify materials.
Reflectance offers a unique signature of the material but is challenging to
measure and use for recognizing materials due to its high-dimensionality. In
this work, one-shot reflectance is captured using a unique optical camera
measuring reflectance disks, where the pixel coordinates correspond to
surface viewing angles. The reflectance has class-specific structure, and angular
gradients computed in this reflectance space reveal the material class.
These reflectance disks encode discriminative information for efficient and
accurate material recognition. We introduce a framework called reflectance
hashing that models the reflectance disks with dictionary learning and binary
hashing. We demonstrate the effectiveness of reflectance hashing for material
recognition with a number of real-world materials.
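The binary-hashing stage can be illustrated with a minimal sketch. The paper pairs dictionary learning with binary hashing; the toy below substitutes simple random-hyperplane hashing (the descriptor dimensionality, code length, and material vectors are all hypothetical) to show how real-valued reflectance descriptors become compact binary codes compared by Hamming distance:

```python
import random

def binary_hash(descriptor, hyperplanes):
    """Hash a real-valued reflectance descriptor to a binary code
    by thresholding its projection onto each random hyperplane."""
    code = []
    for h in hyperplanes:
        proj = sum(d * w for d, w in zip(descriptor, h))
        code.append(1 if proj > 0 else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return sum(x != y for x, y in zip(a, b))

random.seed(0)
dim, bits = 8, 16  # toy descriptor size and code length (assumptions)
hyperplanes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]

# Invented example descriptors: two similar "fabric" signatures and
# one dissimilar "metal" signature.
fabric  = [0.9, 0.8, 0.1, 0.2, 0.7, 0.6, 0.1, 0.3]
fabric2 = [0.85, 0.82, 0.12, 0.18, 0.72, 0.58, 0.09, 0.31]
metal   = [0.1, 0.1, 0.9, 0.95, 0.1, 0.2, 0.9, 0.8]

c1, c2, c3 = (binary_hash(d, hyperplanes) for d in (fabric, fabric2, metal))
```

Similar descriptors tend to receive nearby codes under such a scheme, so nearest-neighbor material lookup reduces to cheap Hamming-distance comparisons.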
Tracking with Local Spatio-Temporal Motion Patterns in Extremely Crowded Scenes
Tracking individuals in extremely crowded scenes is a challenging task, primarily due to the motion and appearance variability produced by the large number of people within the scene. The individual pedestrians, however, collectively form a crowd that exhibits a spatially and temporally structured pattern within the scene. In this paper, we extract this steady-state but dynamically evolving motion of the crowd and leverage it to track individuals in videos of the same scene. We capture the spatial and temporal variations in the crowd’s motion by training a collection of hidden Markov models on the motion patterns within the scene. Using these models, we predict the local spatio-temporal motion patterns that describe the pedestrian movement at each space-time location in the video. Based on these predictions, we hypothesize the target’s movement between frames as it travels through the local space-time volume. In addition, we robustly model the individual’s unique motion and appearance to discern them from surrounding pedestrians. The results show that we may track individuals in scenes that present extreme difficulty to previous techniques.
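As a rough illustration of learning location-specific motion statistics, the sketch below collapses the paper's hidden Markov models into a plain first-order Markov chain over quantized motion directions; the direction codes and training sequences are invented for illustration, not taken from the paper:

```python
def train_transitions(sequences, n_states):
    """Estimate a Markov transition matrix from observed sequences of
    quantized motion directions at one space-time location.
    Add-one smoothing keeps unseen transitions possible."""
    counts = [[1.0] * n_states for _ in range(n_states)]
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1.0
    return [[c / sum(row) for c in row] for row in counts]

def predict_next(state, trans):
    """Most likely next motion pattern given the current one."""
    row = trans[state]
    return max(range(len(row)), key=lambda s: row[s])

# Toy setup: 4 quantized directions (0=right, 1=down, 2=left, 3=up).
# The crowd at this location mostly flows right, occasionally drifting down.
training = [[0, 0, 0, 1, 0, 0], [0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 1, 0]]
T = train_transitions(training, 4)
```

A tracker can then score candidate target displacements by how well they agree with the predicted local motion pattern, in the spirit of the space-time prediction described above.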
DeepShaRM: Multi-View Shape and Reflectance Map Recovery Under Unknown Lighting
Geometry reconstruction of textureless, non-Lambertian objects under unknown
natural illumination (i.e., in the wild) remains challenging as correspondences
cannot be established and the reflectance cannot be expressed in simple
analytical forms. We derive a novel multi-view method, DeepShaRM, that achieves
state-of-the-art accuracy on this challenging task. Unlike past methods that
formulate this as inverse-rendering, i.e., estimation of reflectance,
illumination, and geometry from images, our key idea is to realize that
reflectance and illumination need not be disentangled and can instead be estimated as
a compound reflectance map. We introduce a novel deep reflectance map
estimation network that recovers the camera-view reflectance maps from the
surface normals of the current geometry estimate and the input multi-view
images. The network also explicitly estimates per-pixel confidence scores to
handle global light transport effects. A deep shape-from-shading network then
updates the geometry estimate expressed with a signed distance function using
the recovered reflectance maps. By alternating between these two and, most
importantly, by bypassing the ill-posed problem of reflectance and illumination
decomposition, the method accurately recovers object geometry in these
challenging settings. Extensive experiments on both synthetic and real-world
data clearly demonstrate its state-of-the-art accuracy. Comment: 3DV 202
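The compound reflectance map idea can be illustrated without any network: a reflectance map tabulates observed intensity as a function of camera-view surface normal, folding reflectance and illumination together. The sketch below builds such a table from (normal, intensity, confidence) samples, with the confidence weights loosely standing in for the network's per-pixel confidence scores; the discretization and sample format are assumptions for illustration, not the paper's architecture:

```python
def normal_to_bin(n, res):
    """Map a unit camera-space normal (nx, ny, nz) to a cell of a
    discretized reflectance map; the nx, ny components index the map."""
    x = min(res - 1, int((n[0] + 1) / 2 * res))
    y = min(res - 1, int((n[1] + 1) / 2 * res))
    return x, y

def build_reflectance_map(samples, res=8):
    """Average observed intensities per normal-direction cell, weighted
    by per-pixel confidence (down-weighting pixels corrupted by global
    light transport effects such as interreflections or shadows)."""
    acc, wsum = {}, {}
    for n, intensity, conf in samples:
        key = normal_to_bin(n, res)
        acc[key] = acc.get(key, 0.0) + conf * intensity
        wsum[key] = wsum.get(key, 0.0) + conf
    return {k: acc[k] / wsum[k] for k in acc}

# Invented samples: two observations of the same normal, one trusted
# (confidence 3.0) and one suspect (confidence 1.0).
samples = [((0.0, 0.0, 1.0), 1.0, 3.0), ((0.0, 0.0, 1.0), 0.0, 1.0)]
rm = build_reflectance_map(samples)
```

In the actual method, a shape-from-shading network would then compare such camera-view reflectance maps against the images to update the signed-distance-function geometry, and the two steps alternate.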
DeePoint: Pointing Recognition and Direction Estimation From A Fixed View
In this paper, we realize automatic visual recognition and direction
estimation of pointing. We introduce the first neural pointing understanding
method based on two key contributions. The first is the introduction of a
first-of-its-kind large-scale dataset for pointing recognition and direction
estimation, which we refer to as the DP Dataset. The DP Dataset consists of more
than 2 million frames of over 33 people pointing in various styles, annotated
for each frame with pointing timings and 3D directions. The second is DeePoint,
a novel deep network model for joint recognition and 3D direction estimation of
pointing. DeePoint is a Transformer-based network which fully leverages the
spatio-temporal coordination of the body parts, not just the hands. Through
extensive experiments, we demonstrate the accuracy and efficiency of DeePoint.
We believe the DP Dataset and DeePoint will serve as a sound foundation for
visual human intention understanding.
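A standard way to evaluate 3D direction estimation of this kind is the angular error between the predicted and ground-truth pointing vectors. A small helper (a generic metric, not code from the paper) might look like:

```python
import math

def angular_error_deg(u, v):
    """Angle in degrees between predicted and ground-truth 3D pointing
    directions; neither vector needs to be unit length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against floating-point drift outside [-1, 1].
    c = max(-1.0, min(1.0, dot / (nu * nv)))
    return math.degrees(math.acos(c))
```

Averaging this error over annotated frames gives a single accuracy number for a direction estimator.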
Fooling Polarization-based Vision using Locally Controllable Polarizing Projection
Polarization is a fundamental property of light that encodes abundant
information regarding surface shape, material, illumination and viewing
geometry. The computer vision community has witnessed a blossom of
polarization-based vision applications, such as reflection removal,
shape-from-polarization, transparent object segmentation and color constancy,
partially due to the emergence of single-chip mono/color polarization sensors
that make polarization data acquisition easier than ever. However, is
polarization-based vision vulnerable to adversarial attacks? If so, is it
possible to realize such attacks in the physical world without
being perceived by human eyes? In this paper, we warn the community of the
vulnerability of polarization-based vision, which can be more serious than
RGB-based vision. By adapting a commercial LCD projector, we achieve locally
controllable polarizing projection, which is successfully utilized to fool
state-of-the-art polarization-based vision algorithms for glass segmentation
and color constancy. Compared with existing physical attacks on RGB-based
vision, which always suffer from a trade-off between attack efficacy and
visual concealment, adversarial attacks based on polarizing projection are
contact-free and visually imperceptible, since the naked human eye can rarely
perceive the difference between maliciously manipulated polarized light and
ordinary illumination. This poses unprecedented risks to polarization-based
vision, in both the monochromatic and trichromatic domains, to which due
attention should be paid and countermeasures considered.
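The physical principle behind such an attack can be sketched with Malus's law: a linear analyzer at angle θ_a passes I₀·cos²(θ_p − θ_a) of fully polarized light at angle θ_p. The toy below models the four analyzer orientations of a single-chip polarization sensor and shows why two projector states with the same total radiance but different polarization angles are indistinguishable to an intensity-only sensor while producing very different polarization readings; the specific projector states are hypothetical:

```python
import math

def analyzer_intensity(I0, pol_angle, analyzer_angle):
    """Malus's law: intensity of fully linearly polarized light of
    total intensity I0 after a linear analyzer (angles in degrees)."""
    return I0 * math.cos(math.radians(pol_angle - analyzer_angle)) ** 2

def polarization_camera(I0, pol_angle):
    """Readings behind the four analyzers (0/45/90/135 degrees) of a
    single-chip polarization sensor."""
    return {a: analyzer_intensity(I0, pol_angle, a) for a in (0, 45, 90, 135)}

# Two projector states: identical radiance, different polarization angle.
benign = polarization_camera(1.0, 0)
attack = polarization_camera(1.0, 45)

# An intensity-only sensor effectively sums orthogonal components,
# so both states deposit the same total energy...
rgb_benign = benign[0] + benign[90]
rgb_attack = attack[0] + attack[90]
# ...yet the per-analyzer readings differ sharply, which is what a
# polarization-based algorithm consumes and what the attack manipulates.
```

Because the total intensity is unchanged, a human observer (or an RGB camera) sees nothing unusual, while the polarization channels that downstream vision algorithms rely on are fully under the attacker's control.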